So welcome everybody. Welcome back to our class in deep learning and today we want to look at
recurrent neural networks and how to process sequences. So yeah, we'll have a short look
into the motivation, why we need a different way of modeling for sequences and why this is kind of
special. And then we look into some simple approaches, the simple recurrent
neural networks, and then some ideas on how you can deal with longer sequences if
you want to introduce memory and things like that. We have two different
approaches to tackle this. Then we want to compare them and in the end look a
bit into other purposes of recurrent neural networks because they can also be
used for sampling. And with that you can then also do sequence generation. Okay,
so yeah, motivation. So far we had one input, could be multi-dimensional, one
input vector, but just a single type of input and then we were interested in one
single type of output. So we had this feed-forward processing chain where you
have input, processing, and result, and that was quite useful in a broad variety of
applications. Now of course there are also many different kinds of inputs, in
particular time-dependent signals. If you think about music or speech or
videos or other sensory data, they can be time-dependent, and one big problem that
you have with time-dependent signals is that they often don't have the same
length. So if I speak fast, you're just sitting there nodding, yeah, it's early in the morning.
Okay, just joking. Okay, then you missed all the nice things that I had to say about
Sepp Hochreiter and Jürgen Schmidhuber. Good, so this is the actual setup and we see that we now
still have this concept of a state here, and if you look at this, so remember, this figure is also
something you want to prepare for the oral exam. You want to be able to draw this figure and
explain what's actually happening here. So you see that there are two memories now, there is some
H and some C, and they are also time-dependent, so they change over time. The first thing you notice:
H runs all the way through here, then there is a non-linearity, then it somehow gets influenced by
what comes from the cell state, but we have this long line here, a non-linearity, and that essentially
produces the new state. So this is very similar to what we already know from the Elman cell. We have
a state, some inputs that are associated with it, and that produces in a non-linear way the new
state. So here we essentially see the Elman cell, and here this state then also produces the observation.
So this is not connected, this is just running through here, a non-linearity. So you could say
this branch here and this output here is exactly the Elman state, the Elman cell. So not that much new.
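To make the comparison with the Elman cell a bit more concrete, here is a minimal sketch of one Elman step in NumPy. The weight names (W_x, W_h, W_y) and the choice of tanh as the non-linearity are illustrative assumptions on my side, not the exact notation from the slide.

```python
import numpy as np

def elman_step(x_t, h_prev, W_x, W_h, b_h, W_y, b_y):
    # New hidden state: previous state and current input pass through
    # a single non-linearity (tanh here).
    h_t = np.tanh(W_x @ x_t + W_h @ h_prev + b_h)
    # The same state also produces the observation/output.
    y_t = W_y @ h_t + b_y
    return h_t, y_t
```

Processing a whole sequence then just means looping this step over time and feeding h_t back in as h_prev.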
Well, there is quite a bit new here, because we need all these additional symbols, and they are
associated with the cell state. And the cell state is interesting, because it gets multiplied with
something and it gets added to something, but there is no non-linearity here. This is completely linear
memory. Everything that is happening in the cell state is linear, either multiplication or addition.
And now let's see what's happening. There are essentially two things happening. Yeah, there
is a multiplication here and an addition here. And for the first thing that is happening, we have the input
and the current state, and then there is a non-linearity. This is a sigmoid function,
meaning it produces values between 0 and 1.
Now if you multiply something with 1, what happens?
Not so exciting, nothing.
If you multiply something with 0, it's gone.
And this is a vector, C is a vector,
and now we have element-wise multiplication with 0s or 1s.
This is why this guy is called F. This is the forget gate.
So this is used to kick information out of the cell memory.
So we can delete specific entries.
We reset them so we can forget.
But if we can only forget, it's maybe not so useful.
So we have to also pick up something.
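To sketch just the forget-gate part in code, here is a minimal illustration, assuming the standard LSTM formulation where the gate sees the previous hidden state and the current input; the names W_f and b_f are my own, not taken from the slide.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forget_step(x_t, h_prev, c_prev, W_f, b_f):
    # Gate activation: one value between 0 and 1 per entry of the cell state.
    f_t = sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)
    # Element-wise multiplication: entries near 0 are deleted (forgotten),
    # entries near 1 pass through unchanged. As far as the cell state
    # itself is concerned, this is a purely linear operation.
    c_t = f_t * c_prev
    return c_t
```

The addition into the cell state, where new information is picked up, would then be handled by the input gate.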